Analysis of Partitioning Models and Metrics in Parallel Sparse Matrix-Vector Multiplication
Abstract
Graph and hypergraph partitioning models and methods have been used successfully to minimize the communication requirements among processors in several parallel computing applications. Parallel sparse matrix-vector multiplication (SpMxV) is one of the representative applications that make these models and methods indispensable in many scientific computing contexts. We investigate the interplay between several partitioning metrics and the execution times of SpMxV implementations in three libraries: Trilinos, PETSc, and an in-house library. We design and carry out experiments with up to 512 processors and examine the results with regression analysis. Our experiments show that the partitioning metrics, although not an exact measure of communication cost, greatly influence performance in a distributed-memory setting. The regression analyses demonstrate which metric is the most influential for the execution time of the three libraries used.

Key-words: Parallel sparse matrix-vector multiplication, hypergraph partitioning

∗ Dept. of Biomedical Informatics, The Ohio State University (kamer, [email protected])
† Dept. of Electrical & Computer Engineering, The Ohio State University
‡ CNRS and LIP, ENS Lyon, France ([email protected])
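The abstract does not name the partitioning metrics it studies. As a rough illustration only, the sketch below computes a few metrics commonly reported for a 1D row-wise partition of SpMxV (total communication volume, maximum send volume, and message counts). The function name `spmv_partition_metrics`, the metric names, and the assumption that row i and vector entry x[i] share the same owner are all hypothetical choices made for this example, not taken from the report.

```python
from collections import defaultdict


def spmv_partition_metrics(nonzeros, part, num_parts):
    """Communication metrics for y = A x under a 1D row-wise partition.

    nonzeros : iterable of (i, j) coordinates of the nonzeros of A
    part     : part[k] is the processor owning row k and entry x[k]
               (assumption: conformal partition of rows and vector entries)
    """
    # needed[j] = processors other than the owner of x[j] that need x[j]
    needed = defaultdict(set)
    for i, j in nonzeros:
        if part[i] != part[j]:
            needed[j].add(part[i])

    send_volume = [0] * num_parts   # words sent by each processor
    messages = set()                # distinct (sender, receiver) pairs
    for j, receivers in needed.items():
        sender = part[j]
        send_volume[sender] += len(receivers)
        for receiver in receivers:
            messages.add((sender, receiver))

    sent_messages = [0] * num_parts
    for sender, _ in messages:
        sent_messages[sender] += 1

    return {
        "total_volume": sum(send_volume),
        "max_send_volume": max(send_volume),
        "total_messages": len(messages),
        "max_sent_messages": max(sent_messages),
    }


# Small 4x4 example: diagonal plus a few off-diagonal nonzeros, two processors.
nonzeros = [(0, 0), (1, 1), (2, 2), (3, 3), (0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
part = [0, 0, 1, 1]
metrics = spmv_partition_metrics(nonzeros, part, num_parts=2)
print(metrics)
```

In this example x[2] is needed by two rows on processor 0 but is counted once, which is why volume is measured per distinct (vector entry, receiving processor) pair rather than per off-processor nonzero.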
Similar resources
Hypergraph-Partitioning-Based Decomposition for Parallel Sparse-Matrix Vector Multiplication
In this work, we show that the standard graph-partitioning-based decomposition of sparse matrices does not reflect the actual communication volume requirement for parallel matrix-vector multiplication. We propose two computational hypergraph models which avoid this crucial deficiency of the graph model. The proposed models reduce the decomposition problem to the well-known hypergraph partition...
Reducing latency cost in 2D sparse matrix partitioning models
Sparse matrix partitioning is a common technique used for improving performance of parallel linear iterative solvers. Compared to solvers used for symmetric linear systems, solvers for nonsymmetric systems offer more potential for addressing different multiple communication metrics due to the flexibility of adopting different partitions on the input and output vectors of sparse matrix-vector mu...
Hypergraph Models for Sparse Matrix Partitioning and Reordering
Ümit V. Çatalyürek. Ph.D. in Computer Engineering and Information Science. Supervisor: Assoc. Prof. Cevdet Aykanat. November, 1999. Graphs have been widely used to represent sparse matrices for various scientific applications including one-dimensional (1D) decomposition of sparse matrices for parallel sparse-matrix vector multiplic...
Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems
We propose a comprehensive and generic framework to minimize multiple and different volume-based communication cost metrics for sparse matrix dense matrix multiplication (SpMM). SpMM is an important kernel that finds application in computational linear algebra and big data analytics. On distributed memory systems, this kernel is usually characterized with its high communication volume requireme...
Constrained Fine-Grain Parallel Sparse Matrix Distribution
We consider how to distribute sparse matrices among processors to reduce communication cost in parallel sparse matrix computations, in particular, sparse matrix-vector multiplication. We allow 2d distributions, where the distribution (partitioning) is not constrained to just rows or columns. The fine-grain model is a 2d distribution introduced in [2] where nonzeros can be assigned to processors...
Publication date: 2013